Skip to content

Sorted-batch evaluators for LinearInterpolation and CubicSpline (~9x speedup)#527

Draft
ChrisRackauckas-Claude wants to merge 1 commit into
SciML:masterfrom
ChrisRackauckas-Claude:sorted-batch-linear-cubic
Draft

Sorted-batch evaluators for LinearInterpolation and CubicSpline (~9x speedup)#527
ChrisRackauckas-Claude wants to merge 1 commit into
SciML:masterfrom
ChrisRackauckas-Claude:sorted-batch-linear-cubic

Conversation

@ChrisRackauckas-Claude
Copy link
Copy Markdown
Contributor

Summary

Extends the sorted-batch fast path introduced in #526 (AkimaInterpolation) to two more piecewise interpolation types:

  • LinearInterpolation
  • CubicSpline

When the query vector tt is sorted and the extrapolation modes are simple (None, Constant, Linear, Extension), walks the knots and queries in lockstep in O(n+m) instead of running a per-query binary search via the iguesser. Falls back to map! when the inputs don't fit the fast path: unsorted tt, Periodic/Reflective extrapolation, or (for LinearInterpolation) u containing NaN — preserving the existing NaN-handling semantics from _interpolate(::LinearInterpolation, ...).

The scalar _interpolate, derivative, integral, AD, and plotting paths are unchanged. Output matches the per-point path to floating-point round-off.

Benchmarks

Julia 1.10, n=256 knots, m=4096 sorted queries, BenchmarkTools minimum:

Type before after speedup
LinearInterpolation 126 μs 13.6 μs 9.3×
CubicSpline 167 μs 19.7 μs 8.5×

For reference, AkimaInterpolation (post-#526) is at 16.1 μs on the same setup — Linear edges it out slightly with the simpler polynomial; CubicSpline trails due to the extra arithmetic per query.

Unsorted-input fallback stays at ~450 μs for both (issorted short-circuits on the first out-of-order pair, then map! cost dominates).

Why these two

Of the piecewise types, these have the highest per-query baseline cost relative to the binary-search overhead, so they get the largest win from the lockstep walk. Hermite splines (CubicHermiteSpline, QuinticHermiteSpline) and QuadraticSpline are equally amenable to the same pattern — happy to do them as a follow-up if this design is accepted.

What is in this PR

  • src/interpolation_methods.jl: adds (A::LinearInterpolation{<:AbstractVector{<:Number}})(out, tt) and (A::CubicSpline{<:AbstractVector{<:Number}})(out, tt) callables, each with their own _*_eval_fast_applicable predicate and _*_eval_sorted! inner loop. The Linear and Extension extrapolation cases collapse to one branch for LinearInterpolation (a line is linear); they're kept separate for CubicSpline since the boundary slope and the continued cubic differ.
  • test/interpolation_tests.jl: new "Sorted-batch evaluator" sub-testsets in the Linear and CubicSpline sections, mirroring the coverage from Optimize AkimaInterpolation constructor (3.6-4.1x) and add sorted-batch evaluator (8-9x) #526 (all fast-path extrapolation modes paired left × right, knot pass-through, Periodic/Reflective fallback, None throwing, unsorted fallback, DimensionMismatch, plus cache_parameters=true for CubicSpline and the NaN-in-u fallback for Linear).

Test plan

  • CI passes
  • Pkg.test() all groups pass locally on Julia 1.10 (Core 2738 / Methods 42151 / Extensions 13178 / Misc 11)
  • Output matches per-point path bitwise on the test inputs (verified via max diff print before adding the testset)
  • Runic-formatted

Please ignore until reviewed by @ChrisRackauckas.

🤖 Generated with Claude Code

Extends the pattern from SciML#526 to two more piecewise interpolation types.
When the query vector is sorted and the extrapolation modes are simple
(None / Constant / Linear / Extension), walk the knots and queries in
lockstep in O(n+m) instead of running a per-query binary search via the
iguesser. Falls back to `map!` when the inputs don't fit the fast path:
unsorted `tt`, Periodic/Reflective extrapolation, or (for
LinearInterpolation) `u` containing NaN — preserving the existing
NaN-handling semantics from `_interpolate(::LinearInterpolation, ...)`.

The scalar `_interpolate`, derivative, integral, AD, and plotting paths
are unchanged. Output matches the per-point path to floating-point
round-off, verified across each fast-path extrapolation mode paired
left × right, both `cache_parameters` settings for CubicSpline, knot
pass-through, the Periodic/Reflective fallback, the `None` throwing
behavior on out-of-range sorted queries, and the NaN-in-u fallback for
LinearInterpolation.

Benchmarks (Julia 1.10, n=256, m=4096 sorted queries, BenchmarkTools
`minimum`):

  LinearInterpolation:  126 us  ->  13.6 us   9.3x
  CubicSpline:          167 us  ->  19.7 us   8.5x

Unsorted-input fallback path stays at ~450 us for both (matches the
prior `map!` cost; the `issorted` short-circuits on the first
out-of-order pair).

Co-Authored-By: Chris Rackauckas <accounts@chrisrackauckas.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants